Evaluation of Clinical Text Segmentation to Facilitate Cohort Retrieval
Similar resources
NetEase Automatic Chinese Word Segmentation
This document analyses the bakeoff results from NetEase Co. in the SIGHAN5 Word Segmentation Task and Named Entity Recognition Task. The NetEase WS system is designed to facilitate research in natural language processing and information retrieval. It supports Chinese and English word segmentation, Chinese named entity recognition, Chinese part of speech tagging and phrase conglutination. Evalua...
Document Analysis And Classification Based On Passing Window
In this paper we present a document analysis and classification system that segments and classifies the contents of Arabic document images. The system includes preprocessing, document segmentation, feature extraction and document classification. In preprocessing, a document image is enhanced by removing noise, binarizing it, and detecting and correcting image skew. In document segmentation, an algorith...
Tools and methods for objective or contextual evaluation of topic segmentation
In this paper we discuss ways of evaluating topic segmentation, from mathematical measures on variously constructed reference corpora to contextual evaluation that depends on how the topic segmentation is used. We present an overview of the different ways of building reference corpora and of mathematically evaluating segmentation methods, and then we focus on three tasks which may involve a top...
Chinese Word Segmentation and Information Retrieval
In this paper we present results of experiments with Chinese word segmentation and information retrieval. Our experiments with three different word segmentation algorithms indicate that accurate segmentation measurably improves retrieval performance. We discuss the evaluation of word segmentation algorithms for the purpose of better indexing segmented texts for retrieval.
Recall is the Proper Evaluation Metric for Word Segmentation
We extensively analyse the correlations and drawbacks of conventionally employed evaluation metrics for word segmentation. Unlike in standard information retrieval, precision favours under-splitting systems and therefore can be misleading in word segmentation. Overall, based on both theoretical and experimental analysis, we propose that precision should be excluded from the standard evaluation ...